Heterogeneous Distributed Big Data Clustering on Sparse Grids
Authors
Abstract
Similar resources
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Efficient Regression for Big Data Problems using Adaptive Sparse Grids
The amount of available data increases rapidly. This trend, often referred to as Big Data, challenges modern data mining algorithms, requiring new methods that can cope with very large, multi-variate regression problems. A promising approach that can tackle non-linear, higher-dimensional problems is regression using sparse grids. Sparse grids use a multiscale system of grids with basis functions ...
Collective, Hierarchical Clustering from Distributed, Heterogeneous Data
This paper presents the Collective Hierarchical Clustering (CHC) algorithm for analyzing distributed, heterogeneous data. This algorithm first generates local cluster models and then combines them to generate the global cluster model of the data. The proposed algorithm runs in O(|S|n^2) time, with an O(|S|n) space requirement and O(n) communication requirement, where n is the number of elements in...
k-Means for Streaming and Distributed Big Sparse Data
We provide the first streaming algorithm for computing a provable approximation to the k-means of sparse Big Data. Here, sparse Big Data is a set of n vectors in R^d, where each vector has O(1) non-zero entries, and d ≥ n. Examples include the adjacency matrix of a graph, and web-link, social network, document-term, or image-feature matrices. Our streaming algorithm stores at most log n · k input points in memo...
Distributed Application Management in Heterogeneous Grids
Distributing an application on several machines is one of the key aspects of Grid computing. In the last few years, several groups have developed solutions for the communication problems that arise. However, users are still left on their own when it comes to handling a Grid computer as soon as they face a mix of several Grid software environments on the target machines. This paper presen...
Journal
Journal title: Algorithms
Year: 2019
ISSN: 1999-4893
DOI: 10.3390/a12030060